Autogenerated HTML docs for v1.8.5.3-321-g14598 
diff --git a/technical/pack-heuristics.html b/technical/pack-heuristics.html index b697ea7..3186ac0 100644 --- a/technical/pack-heuristics.html +++ b/technical/pack-heuristics.html 
@@ -3,8 +3,8 @@  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">   <head>   <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />  -<meta name="generator" content="AsciiDoc 8.6.8" />  -<title></title>  +<meta name="generator" content="AsciiDoc 8.6.6" />  +<title>Concerning Git&#8217;s Packing Heuristics</title>   <style type="text/css">   /* Shared CSS for AsciiDoc xhtml11 and html5 backends */    @@ -87,15 +87,11 @@  ul > li { color: #aaa; }   ul > li > * { color: black; }    -.monospaced, code, pre {  - font-family: "Courier New", Courier, monospace;  - font-size: inherit;  - color: navy;  +pre {   padding: 0;   margin: 0;   }    -   #author {   color: #527bbd;   font-weight: bold;  @@ -353,7 +349,7 @@  margin-bottom: 0.1em;   }    -div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {  +div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {   margin-top: 0;   margin-bottom: 0;   }  @@ -411,14 +407,18 @@  span.overline { text-decoration: overline; }   span.line-through { text-decoration: line-through; }    -div.unbreakable { page-break-inside: avoid; }  -     /*   * xhtml11 specific   *   * */    +tt {  + font-family: monospace;  + font-size: inherit;  + color: navy;  +}  +   div.tableblock {   margin-top: 1.0em;   margin-bottom: 1.5em;  @@ -452,6 +452,12 @@  *   * */    +.monospaced {  + font-family: monospace;  + font-size: inherit;  + color: navy;  +}  +   table.tableblock {   margin-top: 1.0em;   margin-bottom: 1.5em;  @@ -531,8 +537,6 @@  @media print {   body.manpage div#toc { display: none; }   }  -  -   </style>   <script type="text/javascript">   /*<![CDATA[*/  @@ -577,7 +581,7 @@    function tocEntries(el, toclevels) {   var result = new Array;  - var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');  + var re = new RegExp('[hH]([2-'+(toclevels+1)+'])');   // Function that scans the DOM tree for header elements (the DOM2   // nodeIterator API would be a better technique but not supported by all   // browsers).  @@ -606,7 +610,7 @@  var i;   for (i = 0; i < toc.childNodes.length; i++) {   var entry = toc.childNodes[i];  - if (entry.nodeName.toLowerCase() == 'div'  + if (entry.nodeName == 'div'   && entry.getAttribute("class")   && entry.getAttribute("class").match(/^toclevel/))   tocEntriesToRemove.push(entry);  @@ -652,7 +656,7 @@  var entriesToRemove = [];   for (i = 0; i < noteholder.childNodes.length; i++) {   var entry = noteholder.childNodes[i];  - if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")  + if (entry.nodeName == 'div' && entry.getAttribute("class") == "footnote")   entriesToRemove.push(entry);   }   for (i = 0; i < entriesToRemove.length; i++) {  @@ -731,22 +735,20 @@  </head>   <body class="article">   <div id="header">  +<h1>Concerning Git&#8217;s Packing Heuristics</h1>   </div>   <div id="content">  +<div id="preamble">  +<div class="sectionbody">   <div class="literalblock">   <div class="content">  -<pre><code>Concerning Git's Packing Heuristics  -===================================</code></pre>  +<pre><tt>Oh, here's a really stupid question:</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Oh, here's a really stupid question:</code></pre>  -</div></div>  -<div class="literalblock">  -<div class="content">  -<pre><code> Where do I go  +<pre><tt> Where do I go   to learn the details  -of Git's packing heuristics?</code></pre>  +of Git's packing heuristics?</tt></pre>   </div></div>   <div class="paragraph"><p>Be careful what you ask!</p></div>   <div class="paragraph"><p>Followers of the Git, please open the Git IRC Log and turn to  @@ -757,11 +759,11 @@  <div class="paragraph"><p>Let&#8217;s listen in!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Oh, here's a really stupid question -- where do I go to  +<pre><tt>&lt;njs`&gt; Oh, here's a really stupid question -- where do I go to   learn the details of Git's packing heuristics? google avails   me not, reading the source didn't help a lot, and wading   through the whole mailing list seems less efficient than any  - of that.</code></pre>  + of that.</tt></pre>   </div></div>   <div class="paragraph"><p>It is a bold start! A plea for help combined with a simultaneous   tri-part attack on some of the tried and true mainstays in the quest  @@ -770,65 +772,65 @@  Woe.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;pasky&gt; yes, the packing-related delta stuff is somewhat  - mysterious even for me ;)</code></pre>  +<pre><tt>&lt;pasky&gt; yes, the packing-related delta stuff is somewhat  + mysterious even for me ;)</tt></pre>   </div></div>   <div class="paragraph"><p>Ah! Modesty after all.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; njs, I don't think the docs exist. That's something where  +<pre><tt>&lt;linus&gt; njs, I don't think the docs exist. That's something where   I don't think anybody else than me even really got involved.   Most of the rest of Git others have been busy with (especially  - Junio), but packing nobody touched after I did it.</code></pre>  + Junio), but packing nobody touched after I did it.</tt></pre>   </div></div>   <div class="paragraph"><p>It&#8217;s cryptic, yet vague. Linus in style for sure. Wise men   interpret this as an apology. A few argue it is merely a   statement of fact.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; I guess the next step is "read the source again", but I  - have to build up a certain level of gumption first :-)</code></pre>  +<pre><tt>&lt;njs`&gt; I guess the next step is "read the source again", but I  + have to build up a certain level of gumption first :-)</tt></pre>   </div></div>   <div class="paragraph"><p>Indeed! On both points.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; The packing heuristic is actually really really simple.</code></pre>  +<pre><tt>&lt;linus&gt; The packing heuristic is actually really really simple.</tt></pre>   </div></div>   <div class="paragraph"><p>Bait&#8230;</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; But strange.</code></pre>  +<pre><tt>&lt;linus&gt; But strange.</tt></pre>   </div></div>   <div class="paragraph"><p>And switch. That ought to do it!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Remember: Git really doesn't follow files. So what it does is  +<pre><tt>&lt;linus&gt; Remember: Git really doesn't follow files. So what it does is   - generate a list of all objects   - sort the list according to magic heuristics   - walk the list, using a sliding window, seeing if an object   can be diffed against another object in the window  - - write out the list in recency order</code></pre>  + - write out the list in recency order</tt></pre>   </div></div>   <div class="paragraph"><p>The traditional understatement:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; I suspect that what I'm missing is the precise definition of  - the word "magic"</code></pre>  +<pre><tt>&lt;njs`&gt; I suspect that what I'm missing is the precise definition of  + the word "magic"</tt></pre>   </div></div>   <div class="paragraph"><p>The traditional insight:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;pasky&gt; yes</code></pre>  +<pre><tt>&lt;pasky&gt; yes</tt></pre>   </div></div>   <div class="paragraph"><p>And Babel-like confusion flowed.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; oh, hmm, and I'm not sure what this sliding window means either</code></pre>  +<pre><tt>&lt;njs`&gt; oh, hmm, and I'm not sure what this sliding window means either</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;pasky&gt; iirc, it appeared to me to be just the sha1 of the object  - when reading the code casually ...</code></pre>  +<pre><tt>&lt;pasky&gt; iirc, it appeared to me to be just the sha1 of the object  + when reading the code casually ...</tt></pre>   </div></div>   <div class="olist lowerroman"><ol class="lowerroman">   <li>  @@ -837,83 +839,83 @@  </p>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; .....and recency order. okay, I think it's clear I didn't  - even realize how much I wasn't realizing :-)</code></pre>  +<pre><tt>&lt;njs`&gt; .....and recency order. okay, I think it's clear I didn't  + even realize how much I wasn't realizing :-)</tt></pre>   </div></div>   </li>   </ol></div>   <div class="paragraph"><p>Ah, grasshopper! And thus the enlightenment begins anew.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; The "magic" is actually in theory totally arbitrary.  +<pre><tt>&lt;linus&gt; The "magic" is actually in theory totally arbitrary.   ANY order will give you a working pack, but no, it's not  - ordered by SHA-1.</code></pre>  + ordered by SHA-1.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Before talking about the ordering for the sliding delta  +<pre><tt>Before talking about the ordering for the sliding delta   window, let's talk about the recency order. That's more  -important in one way.</code></pre>  +important in one way.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Right, but if all you want is a working way to pack things  +<pre><tt>&lt;njs`&gt; Right, but if all you want is a working way to pack things   together, you could just use cat and save yourself some  - trouble...</code></pre>  + trouble...</tt></pre>   </div></div>   <div class="paragraph"><p>Waaait for it&#8230;.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; The recency ordering (which is basically: put objects  +<pre><tt>&lt;linus&gt; The recency ordering (which is basically: put objects   _physically_ into the pack in the order that they are  - "reachable" from the head) is important.</code></pre>  + "reachable" from the head) is important.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; okay</code></pre>  +<pre><tt>&lt;njs`&gt; okay</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; It's important because that's the thing that gives packs  +<pre><tt>&lt;linus&gt; It's important because that's the thing that gives packs   good locality. It keeps the objects close to the head (whether   they are old or new, but they are _reachable_ from the head)   at the head of the pack. So packs actually have absolutely  - _wonderful_ IO patterns.</code></pre>  + _wonderful_ IO patterns.</tt></pre>   </div></div>   <div class="paragraph"><p>Read that again, because it is important.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; But recency ordering is totally useless for deciding how  +<pre><tt>&lt;linus&gt; But recency ordering is totally useless for deciding how   to actually generate the deltas, so the delta ordering is  - something else.</code></pre>  + something else.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>The delta ordering is (wait for it):  +<pre><tt>The delta ordering is (wait for it):   - first sort by the "basename" of the object, as defined by   the name the object was _first_ reached through when   generating the object list   - within the same basename, sort by size of the object  -- but always sort different types separately (commits first).</code></pre>  +- but always sort different types separately (commits first).</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>That's not exactly it, but it's very close.</code></pre>  +<pre><tt>That's not exactly it, but it's very close.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; The "_first_ reached" thing is not too important, just you  +<pre><tt>&lt;njs`&gt; The "_first_ reached" thing is not too important, just you   need some way to break ties since the same objects may be  - reachable many ways, yes?</code></pre>  + reachable many ways, yes?</tt></pre>   </div></div>   <div class="paragraph"><p>And as if to clarify:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; The point is that it's all really just any random  +<pre><tt>&lt;linus&gt; The point is that it's all really just any random   heuristic, and the ordering is totally unimportant for   correctness, but it helps a lot if the heuristic gives   "clumping" for things that are likely to delta well against  - each other.</code></pre>  + each other.</tt></pre>   </div></div>   <div class="paragraph"><p>It is an important point, so secretly, I did my own research and have   included my results below. To be fair, it has changed some over time.  @@ -921,7 +923,7 @@  from The Git IRC Logs on my father&#8217;s birthday, March 1:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;gitster&gt; The quote from the above linus should be rewritten a  +<pre><tt>&lt;gitster&gt; The quote from the above linus should be rewritten a   bit (wait for it):   - first sort by type. Different objects never delta with   each other.  @@ -931,81 +933,81 @@  - then if we are doing "thin" pack, the objects we are _not_   going to pack but we know about are sorted earlier than   other objects.  - - and finally sort by size, larger to smaller.</code></pre>  + - and finally sort by size, larger to smaller.</tt></pre>   </div></div>   <div class="paragraph"><p>In one swell-foop, clarification and obscurification! Nonetheless,   authoritative. Cryptic, yet concise. It even solicits notions of   quotes from The Source Code. Clearly, more study is needed.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;gitster&gt; That's the sort order. What this means is:  +<pre><tt>&lt;gitster&gt; That's the sort order. What this means is:   - we do not delta different object types.   - we prefer to delta the objects with the same full path, but   allow files with the same name from different directories.   - we always prefer to delta against objects we are not going   to send, if there are some.   - we prefer to delta against larger objects, so that we have  - lots of removals.</code></pre>  + lots of removals.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>The penultimate rule is for "thin" packs. It is used when  -the other side is known to have such objects.</code></pre>  +<pre><tt>The penultimate rule is for "thin" packs. It is used when  +the other side is known to have such objects.</tt></pre>   </div></div>   <div class="paragraph"><p>There it is again. "Thin" packs. I&#8217;m thinking to myself, "What   is a <em>thin</em> pack?" So I ask:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;jdl&gt; What is a "thin" pack?</code></pre>  +<pre><tt>&lt;jdl&gt; What is a "thin" pack?</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;gitster&gt; Use of --objects-edge to rev-list as the upstream of  - pack-objects. The pack transfer protocol negotiates that.</code></pre>  +<pre><tt>&lt;gitster&gt; Use of --objects-edge to rev-list as the upstream of  + pack-objects. The pack transfer protocol negotiates that.</tt></pre>   </div></div>   <div class="paragraph"><p>Woo hoo! Cleared that <em>right</em> up!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;gitster&gt; There are two directions - push and fetch.</code></pre>  +<pre><tt>&lt;gitster&gt; There are two directions - push and fetch.</tt></pre>   </div></div>   <div class="paragraph"><p>There! Did you see it? It is not <em>"push" and "pull"</em>! How often the   confusion has started here. So casually mentioned, too!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;gitster&gt; For push, git-send-pack invokes git-receive-pack on the  +<pre><tt>&lt;gitster&gt; For push, git-send-pack invokes git-receive-pack on the   other end. The receive-pack says "I have up to these commits".   send-pack looks at them, and computes what are missing from  - the other end. So "thin" could be the default there.</code></pre>  + the other end. So "thin" could be the default there.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>In the other direction, fetch, git-fetch-pack and  +<pre><tt>In the other direction, fetch, git-fetch-pack and   git-clone-pack invokes git-upload-pack on the other end  -(via ssh or by talking to the daemon).</code></pre>  +(via ssh or by talking to the daemon).</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>There are two cases: fetch-pack with -k and clone-pack is one,  +<pre><tt>There are two cases: fetch-pack with -k and clone-pack is one,   fetch-pack without -k is the other. clone-pack and fetch-pack   with -k will keep the downloaded packfile without expanded, so   we do not use thin pack transfer. Otherwise, the generated  -pack will have delta without base object in the same pack.</code></pre>  +pack will have delta without base object in the same pack.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>But fetch-pack without -k will explode the received pack into  +<pre><tt>But fetch-pack without -k will explode the received pack into   individual objects, so we automatically ask upload-pack to  -give us a thin pack if upload-pack supports it.</code></pre>  +give us a thin pack if upload-pack supports it.</tt></pre>   </div></div>   <div class="paragraph"><p>OK then.</p></div>   <div class="paragraph"><p>Uh.</p></div>   <div class="paragraph"><p>Let&#8217;s return to the previous conversation still in progress.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; and "basename" means something like "the tail of end of  +<pre><tt>&lt;njs`&gt; and "basename" means something like "the tail of end of   path of file objects and dir objects, as per basename(3), and   we just declare all commit and tag objects to have the same  - basename" or something?</code></pre>  + basename" or something?</tt></pre>   </div></div>   <div class="paragraph"><p>Luckily, that too is a point that gitster clarified for us!</p></div>   <div class="paragraph"><p>If I might add, the trick is to make files that <em>might</em> be similar be  @@ -1019,73 +1021,73 @@  content no matter what directory they live in.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; I played around with different delta algorithms, and with  +<pre><tt>&lt;linus&gt; I played around with different delta algorithms, and with   making the "delta window" bigger, but having too big of a   sliding window makes it very expensive to generate the pack:  - you need to compare every object with a _ton_ of other objects.</code></pre>  + you need to compare every object with a _ton_ of other objects.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>There are a number of other trivial heuristics too, which  +<pre><tt>There are a number of other trivial heuristics too, which   basically boil down to "don't bother even trying to delta this   pair" if we can tell before-hand that the delta isn't worth it   (due to size differences, where we can take a previous delta   result into account to decide that "ok, no point in trying  -that one, it will be worse").</code></pre>  +that one, it will be worse").</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>End result: packing is actually very size efficient. It's  +<pre><tt>End result: packing is actually very size efficient. It's   somewhat CPU-wasteful, but on the other hand, since you're   really only supposed to do it maybe once a month (and you can  -do it during the night), nobody really seems to care.</code></pre>  +do it during the night), nobody really seems to care.</tt></pre>   </div></div>   <div class="paragraph"><p>Nice Engineering Touch, there. Find when it doesn&#8217;t matter, and   proclaim it a non-issue. Good style too!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; So, just to repeat to see if I'm following, we start by  +<pre><tt>&lt;njs`&gt; So, just to repeat to see if I'm following, we start by   getting a list of the objects we want to pack, we sort it by   this heuristic (basically lexicographically on the tuple  - (type, basename, size)).</code></pre>  + (type, basename, size)).</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Then we walk through this list, and calculate a delta of  +<pre><tt>Then we walk through this list, and calculate a delta of   each object against the last n (tunable parameter) objects,  -and pick the smallest of these deltas.</code></pre>  +and pick the smallest of these deltas.</tt></pre>   </div></div>   <div class="paragraph"><p>Vastly simplified, but the essence is there!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Correct.</code></pre>  +<pre><tt>&lt;linus&gt; Correct.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; And then once we have picked a delta or fulltext to  +<pre><tt>&lt;njs`&gt; And then once we have picked a delta or fulltext to   represent each object, we re-sort by recency, and write them  - out in that order.</code></pre>  + out in that order.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Yup. Some other small details:</code></pre>  +<pre><tt>&lt;linus&gt; Yup. Some other small details:</tt></pre>   </div></div>   <div class="paragraph"><p>And of course there is the "Other Shoe" Factor too.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; - We limit the delta depth to another magic value (right  - now both the window and delta depth magic values are just "10")</code></pre>  +<pre><tt>&lt;linus&gt; - We limit the delta depth to another magic value (right  + now both the window and delta depth magic values are just "10")</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Hrm, my intuition is that you'd end up with really _bad_ IO  +<pre><tt>&lt;njs`&gt; Hrm, my intuition is that you'd end up with really _bad_ IO   patterns, because the things you want are near by, but to   actually reconstruct them you may have to jump all over in  - random ways.</code></pre>  + random ways.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; - When we write out a delta, and we haven't yet written  +<pre><tt>&lt;linus&gt; - When we write out a delta, and we haven't yet written   out the object it is a delta against, we write out the base   object first. And no, when we reconstruct them, we actually   get nice IO patterns, because:  @@ -1093,58 +1095,58 @@  - we actively try to generate deltas from a larger object to a   smaller one   - this means that the top-of-tree very seldom has deltas  - (i.e. deltas in _practice_ are "backwards deltas")</code></pre>  + (i.e. deltas in _practice_ are "backwards deltas")</tt></pre>   </div></div>   <div class="paragraph"><p>Again, we should reread that whole paragraph. Not just because   Linus has slipped Linus&#8217;s Law in there on us, but because it is   important. Let&#8217;s make sure we clarify some of the points here:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; So the point is just that in practice, delta order and  - recency order match each other quite well.</code></pre>  +<pre><tt>&lt;njs`&gt; So the point is just that in practice, delta order and  + recency order match each other quite well.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Yes. There's another nice side to this (and yes, it was  +<pre><tt>&lt;linus&gt; Yes. There's another nice side to this (and yes, it was   designed that way ;):   - the reason we generate deltas against the larger object is  - actually a big space saver too!</code></pre>  + actually a big space saver too!</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Hmm, but your last comment (if "we haven't yet written out  +<pre><tt>&lt;njs`&gt; Hmm, but your last comment (if "we haven't yet written out   the object it is a delta against, we write out the base object   first"), seems like it would make these facts mostly   irrelevant because even if in practice you would not have to   wander around much, in fact you just brute-force say that in  - the cases where you might have to wander, don't do that :-)</code></pre>  + the cases where you might have to wander, don't do that :-)</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Yes and no. Notice the rule: we only write out the base  +<pre><tt>&lt;linus&gt; Yes and no. Notice the rule: we only write out the base   object first if the delta against it was more recent. That   means that you can actually have deltas that refer to a base   object that is _not_ close to the delta object, but that only  - happens when the delta is needed to generate an _old_ object.</code></pre>  + happens when the delta is needed to generate an _old_ object.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; See?</code></pre>  +<pre><tt>&lt;linus&gt; See?</tt></pre>   </div></div>   <div class="paragraph"><p>Yeah, no. I missed that on the first two or three readings myself.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; This keeps the front of the pack dense. The front of the  +<pre><tt>&lt;linus&gt; This keeps the front of the pack dense. The front of the   pack never contains data that isn't relevant to a "recent"   object. The size optimization comes from our use of xdelta   (but is true for many other delta algorithms): removing data  - is cheaper (in size) than adding data.</code></pre>  + is cheaper (in size) than adding data.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>When you remove data, you only need to say "copy bytes n--m".  +<pre><tt>When you remove data, you only need to say "copy bytes n--m".   In contrast, in a delta that _adds_ data, you have to say "add  -these bytes: 'actual data goes here'"</code></pre>  +these bytes: 'actual data goes here'"</tt></pre>   </div></div>   <div class="ulist"><ul>   <li>  @@ -1153,7 +1155,7 @@  </p>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Uhhuh. I hope I didn't blow njs` mind.</code></pre>  +<pre><tt>&lt;linus&gt; Uhhuh. I hope I didn't blow njs` mind.</tt></pre>   </div></div>   </li>   <li>  @@ -1162,7 +1164,7 @@  </p>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;pasky&gt; :)</code></pre>  +<pre><tt>&lt;pasky&gt; :)</tt></pre>   </div></div>   </li>   </ul></div>  @@ -1170,7 +1172,7 @@  <div class="paragraph"><p>And as if njs` was expected to be omniscient:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; njs - did you miss anything?</code></pre>  +<pre><tt>&lt;linus&gt; njs - did you miss anything?</tt></pre>   </div></div>   <div class="paragraph"><p>OK, I&#8217;ll spell it out. That&#8217;s Geek Humor. If njs` was not actually   connected for a little bit there, how would he know if missed anything  @@ -1178,168 +1180,170 @@  humor! Well noted!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Stupid router. Or gremlins, or whatever.</code></pre>  +<pre><tt>&lt;njs`&gt; Stupid router. Or gremlins, or whatever.</tt></pre>   </div></div>   <div class="paragraph"><p>It&#8217;s a cheap shot at Cisco. Take 'em when you can.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Yes and no. Notice the rule: we only write out the base  - object first if the delta against it was more recent.</code></pre>  +<pre><tt>&lt;njs`&gt; Yes and no. Notice the rule: we only write out the base  + object first if the delta against it was more recent.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>I'm getting lost in all these orders, let me re-read :-)  +<pre><tt>I'm getting lost in all these orders, let me re-read :-)   So the write-out order is from most recent to least recent?   (Conceivably it could be the opposite way too, I'm not sure if   we've said) though my connection back at home is logging, so I  -can just read what you said there :-)</code></pre>  +can just read what you said there :-)</tt></pre>   </div></div>   <div class="paragraph"><p>And for those of you paying attention, the Omniscient Trick has just   been detailed!</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Yes, we always write out most recent first</code></pre>  +<pre><tt>&lt;linus&gt; Yes, we always write out most recent first</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; And, yeah, I got the part about deeper-in-history stuff  - having worse IO characteristics, one sort of doesn't care.</code></pre>  +<pre><tt>&lt;njs`&gt; And, yeah, I got the part about deeper-in-history stuff  + having worse IO characteristics, one sort of doesn't care.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; With the caveat that if the "most recent" needs an older  +<pre><tt>&lt;linus&gt; With the caveat that if the "most recent" needs an older   object to delta against (hey, shrinking sometimes does  - happen), we write out the old object with the delta.</code></pre>  + happen), we write out the old object with the delta.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; (if only it happened more...)</code></pre>  +<pre><tt>&lt;njs`&gt; (if only it happened more...)</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Anyway, the pack-file could easily be denser still, but  +<pre><tt>&lt;linus&gt; Anyway, the pack-file could easily be denser still, but   because it's used both for streaming (the Git protocol) and  - for on-disk, it has a few pessimizations.</code></pre>  + for on-disk, it has a few pessimizations.</tt></pre>   </div></div>   <div class="paragraph"><p>Actually, it is a made-up word. But it is a made-up word being   used as setup for a later optimization, which is a real word:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; In particular, while the pack-file is then compressed,  +<pre><tt>&lt;linus&gt; In particular, while the pack-file is then compressed,   it's compressed just one object at a time, so the actual   compression factor is less than it could be in theory. But it   means that it's all nice random-access with a simple index to  - do "object name-&gt;location in packfile" translation.</code></pre>  + do "object name-&gt;location in packfile" translation.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; I'm assuming the real win for delta-ing large-&gt;small is  - more homogeneous statistics for gzip to run over?</code></pre>  +<pre><tt>&lt;njs`&gt; I'm assuming the real win for delta-ing large-&gt;small is  + more homogeneous statistics for gzip to run over?</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>(You have to put the bytes in one place or another, but  -putting them in a larger blob wins on compression)</code></pre>  +<pre><tt>(You have to put the bytes in one place or another, but  +putting them in a larger blob wins on compression)</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Actually, what is the compression strategy -- each delta  +<pre><tt>Actually, what is the compression strategy -- each delta   individually gzipped, the whole file gzipped, somewhere in  -between, no compression at all, ....?</code></pre>  +between, no compression at all, ....?</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Right.</code></pre>  +<pre><tt>Right.</tt></pre>   </div></div>   <div class="paragraph"><p>Reality IRC sets in. For example:</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;pasky&gt; I'll read the rest in the morning, I really have to go  +<pre><tt>&lt;pasky&gt; I'll read the rest in the morning, I really have to go   sleep or there's no hope whatsoever for me at the today's  - exam... g'nite all.</code></pre>  + exam... g'nite all.</tt></pre>   </div></div>   <div class="paragraph"><p>Heh.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; pasky: g'nite</code></pre>  +<pre><tt>&lt;linus&gt; pasky: g'nite</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; pasky: 'luck</code></pre>  +<pre><tt>&lt;njs`&gt; pasky: 'luck</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Right: large-&gt;small matters exactly because of compression  +<pre><tt>&lt;linus&gt; Right: large-&gt;small matters exactly because of compression   behaviour. If it was non-compressed, it probably wouldn't make  - any difference.</code></pre>  + any difference.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; yeah</code></pre>  +<pre><tt>&lt;njs`&gt; yeah</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Anyway: I'm not even trying to claim that the pack-files  +<pre><tt>&lt;linus&gt; Anyway: I'm not even trying to claim that the pack-files   are perfect, but they do tend to have a nice balance of  - density vs ease-of use.</code></pre>  + density vs ease-of use.</tt></pre>   </div></div>   <div class="paragraph"><p>Gasp! OK, saved. That&#8217;s a fair Engineering trade off. Close call!   In fact, Linus reflects on some Basic Engineering Fundamentals,   design options, etc.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; More importantly, they allow Git to still _conceptually_  - never deal with deltas at all, and be a "whole object" store.</code></pre>  +<pre><tt>&lt;linus&gt; More importantly, they allow Git to still _conceptually_  + never deal with deltas at all, and be a "whole object" store.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Which has some problems (we discussed bad huge-file  +<pre><tt>Which has some problems (we discussed bad huge-file   behaviour on the Git lists the other day), but it does mean   that the basic Git concepts are really really simple and  -straightforward.</code></pre>  +straightforward.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>It's all been quite stable.</code></pre>  +<pre><tt>It's all been quite stable.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Which I think is very much a result of having very simple  +<pre><tt>Which I think is very much a result of having very simple   basic ideas, so that there's never any confusion about what's  -going on.</code></pre>  +going on.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>Bugs happen, but they are "simple" bugs. And bugs that  +<pre><tt>Bugs happen, but they are "simple" bugs. And bugs that   actually get some object store detail wrong are almost always  -so obvious that they never go anywhere.</code></pre>  +so obvious that they never go anywhere.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; Yeah.</code></pre>  +<pre><tt>&lt;njs`&gt; Yeah.</tt></pre>   </div></div>   <div class="paragraph"><p>Nuff said.</p></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;linus&gt; Anyway. I'm off for bed. It's not 6AM here, but I've got  +<pre><tt>&lt;linus&gt; Anyway. I'm off for bed. It's not 6AM here, but I've got   three kids, and have to get up early in the morning to send  - them off. I need my beauty sleep.</code></pre>  + them off. I need my beauty sleep.</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; :-)</code></pre>  +<pre><tt>&lt;njs`&gt; :-)</tt></pre>   </div></div>   <div class="literalblock">   <div class="content">  -<pre><code>&lt;njs`&gt; appreciate the infodump, I really was failing to find the  - details on Git packs :-)</code></pre>  +<pre><tt>&lt;njs`&gt; appreciate the infodump, I really was failing to find the  + details on Git packs :-)</tt></pre>   </div></div>   <div class="paragraph"><p>And now you know the rest of the story.</p></div>   </div>  +</div>  +</div>   <div id="footnotes"><hr /></div>   <div id="footer">   <div id="footer-text">  -Last updated 2013-09-12 16:24:44 PDT  +Last updated 2014-01-13 15:35:15 PST   </div>   </div>   </body>